Abstract

The margin of victory of an election is a useful measure to capture the robustness of an election outcome. It also 
plays a crucial role in determining the sample size of various algorithms in post election audit, polling etc. In this 
work, we present efficient sampling based algorithms for estimating the margin of victory of elections. 

More formally, we introduce the $(c, \varepsilon, \delta)$-Margin of Victory problem, where given an election $\mathcal{E}$ on $n$ voters,
the goal is to estimate the margin of victory $M(\mathcal{E})$ of $\mathcal{E}$ within an additive factor of $cM(\mathcal{E}) + \varepsilon n$. We study the
$(c, \varepsilon, \delta)$-Margin of Victory problem for many commonly used voting rules including scoring rules, approval,
Bucklin, maximin, and Copeland$^\alpha$. We observe that even for the voting rules for which computing the margin of
victory is NP-Hard, there may exist efficient sampling based algorithms, as observed in the cases of maximin and
Copeland$^\alpha$ voting rules.


1 Introduction 

In many real life applications, there is often a need for a set of agents to agree upon a common decision although 
they may have different preferences over the available candidates to choose from. A natural approach used in these 
situations is voting. Some prominent examples of the use of voting rules in the context of multiagent systems include 
collaborative filtering [Pennock et al., 2000], personalized product selection [Lu and Boutilier, 2011] etc. 

In a typical voting scenario, we have a set of votes each of which is a complete ranking over a set of candidates. 
We also have a function called voting rule that takes as input a set of votes and outputs a candidate as the winner. A 
set of votes over a set of candidates along with a voting rule is called an election and the winner is called the outcome 
of the election. 

Given an election, one may like to know how robust the election outcome is with respect to the changes in 
votes [Shiryaev et al., 2013, Caragiannis et al., 2014, Regenwetter et al., 2006]. One way to capture robustness of an 
election outcome is to compute the minimum number of votes that must be changed to change the outcome. This idea 
of robustness is captured precisely by the notion called margin of victory. The margin of victory of an election is the 
smallest number of votes that need to be changed to change the election outcome. In a sense, an election outcome is 
considered to be robust if the margin of victory is large. 

1.1 Motivation 

In addition to being interesting purely because of theoretical reasons, the margin of victory of an election plays a 
crucial role in many practical applications. One such example is post election audits: methods to observe a certain
number of votes (which are often selected randomly) after an election to detect an incorrect outcome. There can
be a variety of reasons for an incorrect election outcome, for example, software or hardware bugs in voting machines
[Norden and Law, 2007], machine output errors, use of various clip-on devices that can tamper with the memory
of the voting machine [Wolchok et al., 2010], and human errors in counting votes. Post election audits have nowadays
become common practice to detect problems in electronic voting machines in many countries, for example, the US.
As a matter of fact, at least thirty states in the US have reported such problems by 2007 [Norden and Law, 2007].
Most often, the auditing process involves manually observing some sampled votes. Researchers have subsequently 
proposed various risk limiting auditing methodologies that not only minimize the cost of manual checking, but also 
limit the risk of making a human error by sampling as few votes as possible [Stark, 2008a, Stark, 2008b, Stark, 2009, 
Sarwate et al., 2011]. The sample size in a risk limiting audit critically depends on the margin of victory of the election. 

Another very important application where the margin of victory plays an important role is polling. In polling, the 
pollster samples a certain number of votes from the population and predicts the outcome of the underlying election 
based on the outcome of the election on the sampled votes. One of the most fundamental questions in polling is: how 
many votes should be sampled? It turns out that the sample complexity in polling too crucially depends on the margin 
of victory of the election from which the pollster is sampling [Canetti et al., 1995, Dey and Bhattacharyya, 2015]. The 
number of samples used in an algorithm is called the sample complexity of that algorithm. As the above discussion 
shows, computing the margin of victory of an election is often a necessary task in many practical applications. However,
one cannot observe all the votes in many applications including the ones discussed above. For example, in a
survey or polling, one cannot first observe all the votes to compute the margin of victory and then sample a few votes
based on the computed margin of victory. Hence, one often needs a “good enough” estimate of the margin of victory
by observing a few votes. We, in this work, precisely address this problem: estimate the margin of victory of an 
election by sampling as few votes as possible. 

1.2 Our Contributions 

Let $n$ be the number of votes, $m$ the number of candidates, and $r$ any voting rule. We introduce and study the following
computational problem in this paper¹:

Definition 1. ($(c, \varepsilon, \delta)$-Margin of Victory (MoV))

Given an $r$-election $\mathcal{E}$, determine $M_r(\mathcal{E})$, the margin of victory of $\mathcal{E}$ with respect to $r$, within an additive error of at most
$cM_r(\mathcal{E}) + \varepsilon n$ with probability at least $1 - \delta$. The probability is taken over the internal coin tosses of the algorithm.
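
The accuracy requirement in Definition 1 can be sketched as a simple predicate. This is only an illustration: the function name `within_guarantee` and its argument names are assumptions made here, not notation from the paper, and the probabilistic part of the definition (success with probability at least $1 - \delta$) is not modeled.

```python
def within_guarantee(estimate, true_mov, n, c, eps):
    """Check the additive error bound |estimate - M| <= c * M + eps * n.

    `true_mov` plays the role of the margin of victory M, `n` is the
    number of votes, and (c, eps) is the approximation factor.
    """
    return abs(estimate - true_mov) <= c * true_mov + eps * n


# With c = 1/3 and eps = 0.01 on n = 1000 votes, an election with
# margin 100 tolerates an error of 100/3 + 10.
ok = within_guarantee(130, 100, n=1000, c=1 / 3, eps=0.01)
too_far = within_guarantee(150, 100, n=1000, c=1 / 3, eps=0.01)
```

Setting `c = 0` recovers a purely additive guarantee, which is the form obtained for the approval-type rules.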

We call the parameter $(c, \varepsilon)$ in Definition 1 the approximation factor of the problem. The notion of approximation
in Definition 1 is a hybrid of what are classically known as additive and multiplicative approximations
(see [Vazirani, 2001]). However, Corollary 1 shows that there does not exist any estimator whose sample complexity
is independent of $n$ and that achieves $o(n)$ additive approximation. Likewise, we cannot hope to have an estimator with sample
complexity independent of $n$ that guarantees a good multiplicative approximation ratio, since there exist elections with
margin of victory only one. This justifies the problem formulation in Definition 1.

Our goal here is to solve the $(c, \varepsilon, \delta)$-MoV problem with as few sample votes as possible. Our main technical
contribution is to come up with efficient sampling based polynomial time randomized algorithms to solve the
$(c, \varepsilon, \delta)$-MoV problem for various voting rules. Each sample reveals the entire preference order of the sampled vote. The
specific contributions of this paper are summarized in Table 1.

Table 1 shows a practically appealing positive result: the sample complexity of all the algorithms presented here is
independent of the number of voters. We also present lower bounds on the sample complexity of the $(c, \varepsilon, \delta)$-MoV
problem for all the common voting rules, which match the upper bounds when we have a constant number of
candidates. Moreover, the lower and upper bounds on the sample complexity match exactly for the $k$-approval voting
rule irrespective of the number of candidates, when $k$ is a constant. The specific contributions of this paper are as follows.

- We show a sample complexity lower bound of $\Omega\left(\frac{1}{\varepsilon^2} \log \frac{1}{\delta}\right)$ for the $(c, \varepsilon, \delta)$-MoV problem for all the commonly
used voting rules, where $c \in [0, 1)$ (Theorem 2 and Corollary 1).

- We show a sample complexity upper bound of $O\left(\frac{1}{\varepsilon^2} \log \frac{m}{\delta}\right)$ for the $(\frac{1}{3}, \varepsilon, \delta)$-MoV problem for arbitrary scoring
rules (Theorem 3). However, for a special class of scoring rules, namely, the $k$-approval voting rules, we have a
sample complexity upper bound of $O\left(\frac{1}{\varepsilon^2} \log \frac{k}{\delta}\right)$ for the $(0, \varepsilon, \delta)$-MoV problem (Theorem 4).

One key finding of our work is that there may exist efficient sampling based polynomial time algorithms for
estimating the margin of victory, even if computing the margin of victory is NP-Hard for a voting rule [Xia, 2012], as
observed in the cases of maximin and Copeland$^\alpha$ voting rules.

¹ Throughout this section, we use standard terminology from voting theory. For formal definitions, refer to Section 2.
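| Voting rule | Sample complexity |
|---|---|
| Scoring rules | $(\frac{1}{3}, \varepsilon, \delta)$-MoV, $\frac{12}{\varepsilon^2} \ln \frac{2m}{\delta}$ [Theorem 3] |
| $k$-approval | $(0, \varepsilon, \delta)$-MoV, $\frac{12}{\varepsilon^2} \ln \frac{2k}{\delta}$ [Theorem 4] |
| Approval | $(0, \varepsilon, \delta)$-MoV, $\frac{12}{\varepsilon^2} \ln \frac{2m}{\delta}$ [Theorem 5] |
| Bucklin | $(\frac{1}{3}, \varepsilon, \delta)$-MoV, $\frac{12}{\varepsilon^2} \ln \frac{2m}{\delta}$ [Theorem 6] |
| Maximin | $(\frac{1}{3}, \varepsilon, \delta)$-MoV, $\frac{24}{\varepsilon^2} \ln \frac{2m}{\delta}$ [Theorem 7] |
| Copeland$^\alpha$ | $\left(1 - O\left(\frac{1}{\ln m}\right), \varepsilon, \delta\right)$-MoV, $\frac{96}{\varepsilon^2} \ln \frac{2m}{\delta}$ [Theorem 8] |
| Lower bound (all rules above) | $(c', \varepsilon, \delta)$-MoV†, $\frac{(1-c')^2}{36\varepsilon^2} \ln \frac{1}{8e\sqrt{\pi}\delta}$ [Corollary 1] |

Table 1: Sample complexity for the $(c, \varepsilon, \delta)$-MoV problem for various voting rules. †The result holds for any $c' \in [0, 1)$.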


1.3 Related Work and Discussion 

Magrino et al. [Magrino et al., 2011] present approximation algorithms to compute the margin of victory for the
instant runoff voting (IRV) rule. Cary [Cary, 2011] provides algorithms to estimate the margin of victory of an IRV
election. Xia [Xia, 2012] presents polynomial time algorithms for computing the margin of victory of an election for
various voting rules, for example the scoring rules, and proves intractability results for several other voting rules, for
example the maximin and Copeland$^\alpha$ voting rules. Endriss and Leite [Endriss and Leite, 2014] compute the complexity
of exact variants of the margin of victory problem for Schulze, Cup, and Copeland voting rules. However, all the
existing algorithms to either compute or estimate the margin of victory need to observe all the votes, which defeats the
purpose in many applications including the ones discussed in Section 1.1. We, in this work, show that we can estimate
the margin of victory for many common voting rules quite accurately by sampling a few votes only. Moreover, the
accuracy of our estimation algorithm is good enough for many practical scenarios. For example, Table 1 shows that it
is enough to select only 3600 many votes uniformly at random to estimate $M(\mathcal{E})/n$ of a plurality election within an error of
0.1 with probability at least 0.99, where $n$ is the number of votes. We note that in all the sampling based applications
discussed in Section 1.1, the sample size is inversely proportional to $M(\mathcal{E})/n$ [Canetti et al., 1995] and thus it is enough to
estimate $M(\mathcal{E})/n$ accurately.

The margin of victory problem is the same as the optimization version of the destructive bribery problem introduced
by [Faliszewski et al., 2006, Faliszewski et al., 2009]. However, to the best of our knowledge, there is no prior
work on estimating the cost of bribery by sampling votes.

Organization. We formally introduce the terminologies in Section 2; we present the results on sampling complexity
lower bounds in Section 3; we present polynomial time sampling based algorithms in Section 4; finally, we
conclude in Section 5.
2 Preliminaries


Let $V = \{v_1, \ldots, v_n\}$ be the set of all votes and $C = \{c_1, \ldots, c_m\}$ the set of all candidates. If not mentioned
otherwise, $m$ and $n$ denote the number of candidates and the number of voters respectively. Each vote $v_i$ is a complete
order $\succ_i$ over the candidates in $C$. For example, for the candidate set $C = \{a, b\}$, $a \succ_i b$ means that the vote $v_i$ prefers
$a$ to $b$. We denote the set of all complete orders over $C$ by $\mathcal{L}(C)$. Hence, $\mathcal{L}(C)^n$ denotes the set of all $n$-voters’
preference profiles $\succ = (\succ_1, \ldots, \succ_n)$. A map $r : \bigcup_{n, |C| \in \mathbb{N}^+} \mathcal{L}(C)^n \to 2^C$ is called a voting rule. Given a vote
profile $\succ \in \mathcal{L}(C)^n$, we call the candidates in the set $r(\succ)$ the winners. The pair $(\succ, C)$ is called an $r$-election $\mathcal{E}$ if the
voting rule used is $r$.

Examples of some common voting rules are as follows. 

Positional scoring rules: A collection of $m$-dimensional vectors $\vec{s}_m = (\alpha_1, \alpha_2, \ldots, \alpha_m) \in \mathbb{R}^m$
with $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_m$ and $\alpha_1 > \alpha_m$ for every $m \in \mathbb{N}$ naturally defines a voting rule:
a candidate gets score $\alpha_i$ from a vote if it is placed at the $i$-th position, and the score of a candidate is the sum of
the scores it receives from all the votes. The winners are the candidates with maximum score. Scoring rules remain
unchanged if we multiply every $\alpha_i$ by any constant $\lambda > 0$ and/or add any constant $\mu$. Hence, we can assume without
loss of generality that in every score vector $\alpha$, there exists a $j$ with $\alpha_j - \alpha_{j+1} = 1$ and $\alpha_i = 0$ for all $i > j$. We call
such a vector $\alpha$ a normalized score vector.

The vector $\alpha$ that is 1 in the first $k$ coordinates and 0 elsewhere gives the $k$-approval voting rule. 1-approval is
called the plurality voting rule. The score vector $(m - 1, m - 2, \ldots, 1, 0)$ gives the Borda voting rule.
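
The scoring rules just described can be sketched in a few lines. The representation is an assumption made for illustration: each vote is a tuple listing all candidates from most to least preferred, and the helper names `score_vector`, `scores`, and `winners` are hypothetical.

```python
def score_vector(rule, m, k=1):
    """Score vector for m candidates under a named rule."""
    if rule == "borda":
        return [m - 1 - i for i in range(m)]
    if rule == "k-approval":  # k = 1 gives plurality
        return [1] * k + [0] * (m - k)
    raise ValueError(f"unknown rule: {rule}")


def scores(votes, alpha):
    """Sum, over all votes, the score alpha[i] of the position i a candidate holds."""
    total = {}
    for vote in votes:
        for position, candidate in enumerate(vote):
            total[candidate] = total.get(candidate, 0) + alpha[position]
    return total


def winners(score):
    """All candidates attaining the maximum score."""
    best = max(score.values())
    return {c for c, v in score.items() if v == best}


votes = [("a", "b", "c"), ("a", "c", "b"), ("b", "c", "a")]
plurality = scores(votes, score_vector("k-approval", 3, k=1))
borda = scores(votes, score_vector("borda", 3))
```

On this three-vote profile, candidate `a` is the unique winner under both plurality and Borda.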

Approval: In approval voting, each vote approves a subset of candidates. The winners are the candidates which 
are approved by the maximum number of votes. 

Bucklin: A candidate $x$’s Bucklin score is the minimum number $\ell$ such that at least half of the votes rank $x$ in
their top $\ell$ positions. The winners are the candidates with lowest Bucklin score.

Maximin: Given an election $\mathcal{E}$ and any two candidates $x$ and $y$, the quantity $D_{\mathcal{E}}(x, y)$ is defined as $N_{\mathcal{E}}(x, y) - N_{\mathcal{E}}(y, x)$,
where $N_{\mathcal{E}}(x, y)$ (respectively $N_{\mathcal{E}}(y, x)$) is the number of votes which prefer $x$ to $y$ (respectively $y$ to $x$).
The maximin score of a candidate $x$ is $\min_{y \ne x} D_{\mathcal{E}}(x, y)$. The winners are the candidates with maximum maximin
score.

Copeland$^\alpha$: The Copeland$^\alpha$ score of a candidate $x$ is $|\{y \ne x : D_{\mathcal{E}}(x, y) > 0\}| + \alpha |\{y \ne x : D_{\mathcal{E}}(x, y) = 0\}|$,
where $\alpha \in [0, 1]$. The winners are the candidates with the maximum Copeland$^\alpha$ score.
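
The three rules above can be sketched directly from their definitions. The data layout (votes as tuples ranking every candidate, pairwise differences stored as a nested dictionary `D[x][y]`) and all function names are assumptions for illustration only.

```python
def pairwise_margins(votes, candidates):
    """D[x][y] = (#votes preferring x to y) - (#votes preferring y to x)."""
    D = {x: {y: 0 for y in candidates if y != x} for x in candidates}
    for vote in votes:
        position = {c: i for i, c in enumerate(vote)}
        for x in candidates:
            for y in candidates:
                if x != y:
                    D[x][y] += 1 if position[x] < position[y] else -1
    return D


def maximin_scores(D):
    """Maximin score of x is the minimum of D[x][y] over y != x."""
    return {x: min(D[x].values()) for x in D}


def copeland_scores(D, alpha=0.5):
    """Pairwise wins count 1, pairwise ties count alpha."""
    return {
        x: sum(1 for d in D[x].values() if d > 0)
        + alpha * sum(1 for d in D[x].values() if d == 0)
        for x in D
    }


def bucklin_scores(votes, candidates):
    """Smallest l such that at least half of the votes rank x in their top l."""
    n = len(votes)
    result = {}
    for x in candidates:
        for ell in range(1, len(candidates) + 1):
            if 2 * sum(1 for v in votes if x in v[:ell]) >= n:
                result[x] = ell
                break
    return result


candidates = ["a", "b", "c"]
votes = [("a", "b", "c"), ("a", "c", "b"), ("b", "c", "a")]
D = pairwise_margins(votes, candidates)
```

Note that maximin and Copeland$^\alpha$ winners maximize their score, while Bucklin winners minimize it.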

For score based voting rules (all the voting rules mentioned above are score based), we denote the score of any
candidate $x \in C$ by $s(x)$. Given an integer $\ell$, we denote the set $\{1, \ldots, \ell\}$ by $[\ell]$. The notion of margin of victory of
an election is defined as follows.

Definition 2. (Margin of Victory (MoV))

Given an election $\mathcal{E} = (\succ, C)$ with voting rule $r$, the margin of victory of $\mathcal{E}$, denoted by $M_r(\mathcal{E})$, is the minimum
number of votes that should be changed to change the winning set $r(\succ)$.
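
For plurality, Definition 2 admits a simple closed form, sketched below as an illustration. The reasoning is an assumption spelled out here rather than a statement from the paper: moving one vote from the winner to the runner-up closes the score gap by two, so the winning set changes once the gap is closed to zero or below. The function name `plurality_mov` is hypothetical.

```python
from collections import Counter
from math import ceil


def plurality_mov(top_choices, candidates):
    """Margin of victory of a plurality election with at least two candidates.

    `top_choices` lists each vote's first-ranked candidate. The winning set
    is taken to be all candidates with maximum score, as in Section 2.
    """
    if len(candidates) < 2:
        raise ValueError("need at least two candidates")
    counts = Counter(top_choices)
    ranked = sorted((counts.get(c, 0) for c in candidates), reverse=True)
    gap = ranked[0] - ranked[1]
    # Each changed vote (winner -> runner-up) shrinks the gap by 2;
    # a tied election already changes its winning set after one change.
    return max(1, ceil(gap / 2))


mov = plurality_mov(["a"] * 5 + ["b"] * 2, ["a", "b"])
```

With five votes for `a` and two for `b`, two changed votes are needed: after one change `a` still leads 4 to 3.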

2.1 Chernoff Bound 

We repeatedly use the following concentration inequality: 

Theorem 1. Let $X_1, \ldots, X_\ell$ be a sequence of $\ell$ independent random variables in $[0, 1]$ (not necessarily identical). Let
$S = \sum_{i} X_i$ and let $\mu = \mathbb{E}[S]$. Then, for any $0 \le \delta \le 1$:

$$\Pr[|S - \mu| \ge \delta\mu] < 2\exp(-\delta^2\mu/3)$$


3 Sample Complexity Lower Bounds 

Our lower bounds for the sample complexity of the $(c, \varepsilon, \delta)$-MoV problem are derived from the information-theoretic
lower bound for distinguishing two distributions. We start with the following basic observation. Let $X$ be a random
variable taking value 1 with probability $\frac{1}{2} - \varepsilon$ and 0 with probability $\frac{1}{2} + \varepsilon$; $Y$ be a random variable taking value 1 with
probability $\frac{1}{2}$ and 0 with probability $\frac{1}{2}$. Then, it is well-known that every algorithm needs at least $\frac{1}{4\varepsilon^2} \ln \frac{1}{8e\sqrt{\pi}\delta}$ many
samples to distinguish between $X$ and $Y$ with probability of making an error being at most $\delta$ [Canetti et al., 1995].
Immediately, we have:


Theorem 2. The sample complexity of the $(c, \varepsilon, \delta)$-MoV problem for the plurality voting rule is at least $\frac{(1-c)^2}{36\varepsilon^2} \ln \frac{1}{8e\sqrt{\pi}\delta}$
for any $c \in [0, 1)$.



Proof: Consider two vote distributions $X$ and $Y$, each over the candidate set $\{a, b\}$. In $X$, exactly $\frac{1}{2} + \frac{3\varepsilon}{1-c}$ fraction
of voters prefer $a$ to $b$ and thus the margin of victory is $\frac{3\varepsilon n}{1-c}$. In $Y$, exactly $\frac{1}{2}$ fraction of voters prefer $b$ to $a$ and
thus the margin of victory is one. Any $(c, \varepsilon, \delta)$-MoV algorithm $\mathcal{A}$ for the plurality voting rule gives us a distinguisher
between $X$ and $Y$ with probability of error at most $2\delta$. This is so because, if the input to $\mathcal{A}$ is $X$ then the output of
$\mathcal{A}$ is less than $c + 2\varepsilon n$ with probability at most $\delta$, whereas, if the input to $\mathcal{A}$ is $Y$ then the output of $\mathcal{A}$ is more than
$c + \varepsilon n$ with probability at most $\delta$. Now, since $n$ can be arbitrarily large, we get the result. □


Theorem 2 immediately gives the following corollary. 

Corollary 1. For any $c \in [0, 1)$, every $(c, \varepsilon, \delta)$-MoV algorithm needs at least $\frac{(1-c)^2}{36\varepsilon^2} \ln \frac{1}{8e\sqrt{\pi}\delta}$ samples for
all voting rules which reduce to the plurality rule for two candidates. In particular, the lower bound holds for scoring
rules, approval, Bucklin, maximin, and Copeland$^\alpha$ voting rules.

We note that the lower bound results in Theorem 2 and Corollary 1 do not assume anything about the sampling 
strategy or the computational complexity of the estimator. 


4 Sampling Based Algorithms 

A natural approach for estimating the margin of victory of an election efficiently is to compute the margin of victory 
of a suitably small number of sampled votes. Certainly, it is not immediate that samples chosen uniformly at random 
preserve the value of the margin of victory of the original election within some desired factor. Although it may be 
possible to formulate clever sampling strategies that tie into the margin of victory structure of the election, we will 
show that uniformly chosen samples are good enough to design algorithms for estimating the margin of victory for 
the voting rules studied here. Our proposal has the advantage that the sampling component of our algorithms is
always easy to implement, and further, there is no compromise on the bounds in the sense that they are optimal for any 
constant number of candidates. 

Because our samples are chosen uniformly at random, our analysis relies only on the fact that a sufficiently large 
sample of votes has been drawn. Our algorithms involve computing a quantity (which depends on the voting rule
under consideration) based on the sampled votes, which we argue to be a suitable estimate of the margin of victory of 
the original election. This quantity is not necessarily the margin of victory of the sampled votes. For scoring rules, for 
instance, we will use the sampled votes to estimate candidate scores, and we use the difference between the top two 
candidate scores (suitably scaled) as the margin of victory estimate. We also establish a relationship between scores 
and values of the margin of victory to achieve the desired bounds on the estimate. The overall strategy is in a similar 
spirit for other voting rules as well, although the exact estimates may be different. We now turn to a more detailed 
description, although some proofs are omitted due to lack of space. 

4.1 Scoring Rules and Approval Voting Rule 

We begin with the class of scoring rules. Interestingly, the margin of victory of any scoring rule based election can still
be estimated quite accurately by sampling only $\frac{12}{\varepsilon^2} \ln \frac{2m}{\delta}$ many votes. An important thing to note is that the sample
complexity upper bound is independent of the score vector. Before embarking on the proof of this general result, we
prove a structural lemma which will be used crucially in the subsequent proof.

Lemma 1. Let $\alpha = (\alpha_1, \ldots, \alpha_m)$ be any normalized score vector (hence, $\alpha_m = 0$). If $w$ and $z$ are the candidates
that receive highest and second highest score respectively in an $\alpha$-scoring rule election instance $\mathcal{E} = (V, C)$, then,

$$\alpha_1 (M_\alpha(\mathcal{E}) - 1) \le s(w) - s(z) \le 2\alpha_1 M_\alpha(\mathcal{E})$$
Proof: Let $M_\alpha(\mathcal{E})$ be the margin of victory of $\mathcal{E}$. We claim that there must be at least $M_\alpha(\mathcal{E}) - 1$ many votes
$v \in V$ where $w$ is preferred over $z$. Indeed, otherwise, we swap $w$ and $z$ in all the votes where $w$ is preferred over
$z$. This makes $z$ win the election. However, we have changed at most $M_\alpha(\mathcal{E}) - 1$ votes only. This contradicts the
definition of margin of victory (see Definition 2). Let $v \in V$ be a vote where $w$ is preferred over $z$. Let $\alpha_i$ and
$\alpha_j$ (with $i < j$) be the scores received by the candidates $w$ and $z$ respectively from the vote $v$. We replace the vote $v$ by
$v' = z \succ \cdots \succ w$. This vote change reduces the value of $s(w) - s(z)$ by $\alpha_1 + \alpha_i - \alpha_j$ which is at least $\alpha_1$. Hence,
$\alpha_1 (M_\alpha(\mathcal{E}) - 1) \le s(w) - s(z)$. Each vote change reduces the value of $s(w) - s(z)$ by at most $2\alpha_1$ since $\alpha_m = 0$.
Hence, $s(w) - s(z) \le 2\alpha_1 M_\alpha(\mathcal{E})$. □


With Lemma 1 at hand, we show our estimation algorithm for the scoring rules next.

Theorem 3. There is a polynomial time $(\frac{1}{3}, \varepsilon, \delta)$-MoV algorithm for the scoring rules with sample complexity
$\frac{12}{\varepsilon^2} \ln \frac{2m}{\delta}$.

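Proof: Let $\alpha = (\alpha_1, \ldots, \alpha_m)$ be any arbitrary normalized score vector and $\mathcal{E} = (V, C)$ an election instance. We
sample $\ell$ (the value of $\ell$ will be chosen later) votes uniformly at random from the set of votes with replacement.
For a candidate $x$, define a random variable $X_i(x) = \frac{\alpha_j}{\alpha_1}$ if $x$ gets a score of $\alpha_j$ from the $i$-th sample vote. Define
$\hat{s}(x) = \frac{n\alpha_1}{\ell} \sum_{i=1}^{\ell} X_i(x)$, the estimate of $s(x)$, the score of $x$. Also define $\varepsilon' = \frac{\varepsilon}{2}$. Now, using Chernoff bound
(Theorem 1), we have the following.

$$\Pr[|\hat{s}(x) - s(x)| \ge \alpha_1 \varepsilon' n] \le 2\exp\left(-\frac{\varepsilon'^2 \ell}{3}\right)$$

We now use the union bound to get the following.

$$\Pr[\exists x \in C, |\hat{s}(x) - s(x)| \ge \alpha_1 \varepsilon' n] \le 2m\exp\left(-\frac{\varepsilon'^2 \ell}{3}\right) \qquad (1)$$

Define $\hat{M} = \frac{\hat{s}(\hat{w}) - \hat{s}(\hat{z})}{1.5\alpha_1}$, the estimate of the margin of victory of the election $\mathcal{E}$ (and thus the output of the
algorithm), where $\hat{w} \in \arg\max_{x \in C} \{\hat{s}(x)\}$ and $\hat{z} \in \arg\max_{x \in C \setminus \{\hat{w}\}} \{\hat{s}(x)\}$. We claim that, if $\forall x \in C$,
$|\hat{s}(x) - s(x)| \le \alpha_1 \varepsilon' n$, then $|\hat{M} - M_\alpha(\mathcal{E})| \le \frac{1}{3} M_\alpha(\mathcal{E}) + \varepsilon n$. This can be shown as follows.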


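$$\begin{aligned}
\hat{M} - M_\alpha(\mathcal{E}) &= \frac{\hat{s}(\hat{w}) - \hat{s}(\hat{z})}{1.5\alpha_1} - M_\alpha(\mathcal{E}) \\
&\le \frac{s(w) - s(z)}{1.5\alpha_1} + \frac{2\varepsilon' n}{1.5} - M_\alpha(\mathcal{E}) \\
&\le \frac{1}{3} M_\alpha(\mathcal{E}) + \varepsilon n
\end{aligned}$$

The inequality in the second line follows from the fact that $\hat{s}(\hat{w}) \le s(\hat{w}) + \alpha_1 \varepsilon' n \le s(w) + \alpha_1 \varepsilon' n$ and
$\hat{s}(\hat{z}) \ge \hat{s}(z) \ge s(z) - \alpha_1 \varepsilon' n$. The inequality in the third line follows from Lemma 1. Similarly, we bound
$M_\alpha(\mathcal{E}) - \hat{M}$ as follows.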


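$$\begin{aligned}
M_\alpha(\mathcal{E}) - \hat{M} &= M_\alpha(\mathcal{E}) - \frac{\hat{s}(\hat{w}) - \hat{s}(\hat{z})}{1.5\alpha_1} \\
&\le M_\alpha(\mathcal{E}) - \frac{s(w) - s(z)}{1.5\alpha_1} + \frac{2\varepsilon' n}{1.5} \\
&\le \frac{1}{3} M_\alpha(\mathcal{E}) + \varepsilon n
\end{aligned}$$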


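This proves the claim. Now, we bound the success probability of the algorithm as follows. Let $A$ be the event that
$\forall x \in C$, $|\hat{s}(x) - s(x)| \le \alpha_1 \varepsilon' n$.

$$\begin{aligned}
\Pr\left[|\hat{M} - M_\alpha(\mathcal{E})| \le \tfrac{1}{3} M_\alpha(\mathcal{E}) + \varepsilon n\right]
&\ge \Pr\left[|\hat{M} - M_\alpha(\mathcal{E})| \le \tfrac{1}{3} M_\alpha(\mathcal{E}) + \varepsilon n \,\middle|\, A\right] \Pr[A] \\
&= \Pr[A] \\
&\ge 1 - 2m\exp(-\varepsilon'^2 \ell / 3)
\end{aligned}$$

The equality in the third line follows from the claim above (which uses Lemma 1) and the final inequality follows from inequality (1). Now, by choosing
$\ell = \frac{12}{\varepsilon^2} \ln \frac{2m}{\delta}$, we get a $(\frac{1}{3}, \varepsilon, \delta)$-MoV algorithm for the scoring rules. □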
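
The estimator from the proof of Theorem 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: votes are tuples ranking all candidates, the value $\varepsilon' = \varepsilon/2$ is assumed, the sample size is the smallest $\ell$ making $2m\exp(-\varepsilon'^2\ell/3) \le \delta$, and the names `sample_size` and `estimate_mov_scoring` are hypothetical.

```python
import math
import random


def sample_size(eps, delta, m):
    """Smallest l with 2 * m * exp(-(eps / 2) ** 2 * l / 3) <= delta."""
    eps_prime = eps / 2  # assumed choice of eps'
    return math.ceil(3 / eps_prime ** 2 * math.log(2 * m / delta))


def estimate_mov_scoring(votes, alpha, eps, delta, rng=random):
    """Estimate the margin of victory of a scoring-rule election by sampling.

    `alpha` is a normalized score vector; the output is the difference of
    the two largest estimated scores divided by 1.5 * alpha[0].
    """
    n = len(votes)
    ell = sample_size(eps, delta, len(alpha))
    estimate = {c: 0.0 for c in votes[0]}
    for _ in range(ell):
        vote = votes[rng.randrange(n)]  # uniform, with replacement
        for position, candidate in enumerate(vote):
            estimate[candidate] += alpha[position]
    scaled = sorted((v * n / ell for v in estimate.values()), reverse=True)
    return (scaled[0] - scaled[1]) / (1.5 * alpha[0])


# Plurality election with 700 votes for a and 300 for b (true margin 200).
votes = [("a", "b", "c")] * 700 + [("b", "a", "c")] * 300
estimate = estimate_mov_scoring(votes, [1, 0, 0], eps=0.1, delta=0.05,
                                rng=random.Random(0))
```

The number of sampled votes depends only on $\varepsilon$, $\delta$, and $m$, never on $n$, which is the point of Table 1. The division by $1.5\alpha_1$ centers the estimate between the two bounds of Lemma 1.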


Now, we show an algorithm for the $(0, \varepsilon, \delta)$-MoV problem for the $k$-approval voting rule which not only provides
a more accurate estimate of the margin of victory, but also has a lower sample complexity. The following lemmas will
be used subsequently.

Lemma 2. Let $\mathcal{E} = (V, C)$ be an arbitrary instance of a $k$-approval election. If $w$ and $z$ are the candidates that
receive highest and second highest score respectively in $\mathcal{E}$, then,

$$2(M_{k\text{-approval}}(\mathcal{E}) - 1) < s(w) - s(z) \le 2M_{k\text{-approval}}(\mathcal{E})$$

Proof: We call a vote $v \in V$ favorable if $w$ appears within the top $k$ positions and $z$ does not appear within the top
$k$ positions in $v$. We claim that the number of favorable votes must be at least $M_{k\text{-approval}}(\mathcal{E})$. Indeed, otherwise,
we swap the positions of $w$ and $z$ in all the favorable votes while keeping the other candidates fixed. This makes the
score of $z$ at least as much as the score of $w$ which contradicts the fact that the margin of victory is $M_{k\text{-approval}}(\mathcal{E})$.
Now, notice that the score of $z$ must remain less than the score of $w$ even if we swap the positions of $w$ and $z$ in
$M_{k\text{-approval}}(\mathcal{E}) - 1$ many favorable votes, since the margin of victory is $M_{k\text{-approval}}(\mathcal{E})$. Each such vote change
increases the score of $z$ by one and reduces the score of $w$ by one. Hence, $2(M_{k\text{-approval}}(\mathcal{E}) - 1) < s(w) - s(z)$. Again,
since the margin of victory is $M_{k\text{-approval}}(\mathcal{E})$, there exists a candidate $x$ other than $w$ and $M_{k\text{-approval}}(\mathcal{E})$ many votes
in $V$ which can be modified such that $x$ becomes a winner of the modified election. Now, each vote change can reduce
the score of $w$ by at most one and increase the score of $x$ by at most one. Hence, $s(w) - s(x) \le 2M_{k\text{-approval}}(\mathcal{E})$
and thus $s(w) - s(z) \le 2M_{k\text{-approval}}(\mathcal{E})$ since $s(z) \ge s(x)$. □


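Lemma 3. Let $f : \mathbb{R}^+ \to \mathbb{R}$ be a function defined by $f(x) = e^{-\lambda/x}$. Then,

$$f(x) + f(y) \le f(x + y), \quad \text{for } x, y > 0,\ \frac{\lambda}{x + y} > 2,\ x < y$$

Proof: For the function $f(x)$, we have the following.

$$f(x) = e^{-\lambda/x} \implies f''(x) = \frac{\lambda}{x^3} e^{-\lambda/x} \left(\frac{\lambda}{x} - 2\right)$$

Hence, for $y > x > 0$ and $\frac{\lambda}{x + y} > 2$, we have $f''(x), f''(y), f''(x + y) > 0$. This implies the following for an
infinitesimal positive $\delta$.

$$\begin{aligned}
& f'(x) \le f'(y) \\
\implies\ & \frac{f(x - \delta) - f(x)}{-\delta} \le \frac{f(y) - f(y + \delta)}{-\delta} \\
\implies\ & f(x) + f(y) \le f(x - \delta) + f(y + \delta) \\
\implies\ & f(x) + f(y) \le f(x + y)
\end{aligned}$$

□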


With Lemma 2 and 3 at hand, we now describe our margin of victory estimator.

Theorem 4. There is a polynomial time $(0, \varepsilon, \delta)$-MoV algorithm for the $k$-approval rule whose sample complexity is
$\frac{12}{\varepsilon^2} \ln \frac{2k}{\delta}$.

Proof: Let $\mathcal{E} = (V, C)$ be an arbitrary $k$-approval election. We sample $\ell$ votes uniformly at random from $V$ with
replacement. For a candidate $x$, define a random variable $X_i(x)$ which takes value 1 if $x$ appears among the top $k$
candidates in the $i$-th sample vote, and 0 otherwise. Define $\hat{s}(x) = \frac{n}{\ell} \sum_{i=1}^{\ell} X_i(x)$, the estimate of the score of the
candidate $x$, and let $s(x)$ be the actual score of $x$. Also define $\varepsilon' = \frac{\varepsilon}{2}$. Then by the Chernoff bound (Theorem 1), we
have:

$$\Pr[|\hat{s}(x) - s(x)| \ge \varepsilon' n] \le 2\exp\left(-\frac{\varepsilon'^2 \ell n}{3 s(x)}\right)$$

Now, we apply the union bound to get the following.

$$\Pr[\exists x \in C, |\hat{s}(x) - s(x)| \ge \varepsilon' n] \le \sum_{x \in C} 2\exp\left(-\frac{\varepsilon'^2 \ell n}{3 s(x)}\right) \le 2k\exp(-\varepsilon'^2 \ell / 3) \qquad (2)$$

The second inequality follows from Lemma 3: The expression $\sum_{x \in C} 2\exp\left(-\frac{\varepsilon'^2 \ell n}{3 s(x)}\right)$ is maximized subject to the
constraints that $0 \le s(x) \le n$, $\forall x \in C$ and $\sum_{x \in C} s(x) = kn$, when $s(x) = n$ $\forall x \in C'$ for any subset of candidates
$C' \subseteq C$ with $|C'| = k$ and $s(x) = 0$ $\forall x \in C \setminus C'$.

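Now, to estimate the margin of victory of the given election $\mathcal{E}$, let $\hat{w}$ and $\hat{z}$ be candidates with maximum and second
maximum estimated score respectively. That is, $\hat{w} \in \arg\max_{x \in C} \{\hat{s}(x)\}$ and $\hat{z} \in \arg\max_{x \in C \setminus \{\hat{w}\}} \{\hat{s}(x)\}$. We define
$\hat{M} = \frac{\hat{s}(\hat{w}) - \hat{s}(\hat{z})}{2}$, the estimate of the margin of victory of the election $\mathcal{E}$ (and thus the output of the algorithm). Let $A$
be the event that $\forall x \in C$, $|\hat{s}(x) - s(x)| \le \varepsilon' n$. We bound the success probability of the algorithm as follows.

$$\begin{aligned}
\Pr\left[|\hat{M} - M_{k\text{-approval}}(\mathcal{E})| \le \varepsilon n\right]
&\ge \Pr\left[|\hat{M} - M_{k\text{-approval}}(\mathcal{E})| \le \varepsilon n \,\middle|\, A\right] \Pr[A] \\
&= \Pr[A] \\
&\ge 1 - 2k\exp(-\varepsilon'^2 \ell / 3)
\end{aligned}$$

The equality in the third line follows from Lemma 2 and an argument analogous to the proof of Theorem 3. The final
inequality follows from inequality (2). Now, by choosing $\ell = \frac{12}{\varepsilon^2} \ln \frac{2k}{\delta}$, we get a $(0, \varepsilon, \delta)$-MoV algorithm. □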

Note that the sample complexity upper bound matches with the lower bound proved in Corollary 1 for the
$k$-approval voting rule when $k$ is a constant, irrespective of the number of candidates. For the approval voting rule, we
have the following result.

Theorem 5. There is a polynomial time $(0, \varepsilon, \delta)$-MoV algorithm for the approval rule with sample complexity
$\frac{12}{\varepsilon^2} \ln \frac{2m}{\delta}$.

Proof sketch: We estimate the approval score of every candidate within an additive factor of $\frac{\varepsilon}{2} n$ by sampling $\frac{12}{\varepsilon^2} \ln \frac{2m}{\delta}$
many votes uniformly at random with replacement and the result follows from an argument analogous to the proofs of
Lemma 2 and Theorem 4. □


4.2 Bucklin Voting Rule 


Now, we consider the Bucklin voting rule. Given an election $\mathcal{E} = (V, C)$, a candidate $x \in C$, and an integer $\ell \in [m]$,
we denote the number of votes in $V$ in which $x$ appears within the top $\ell$ positions by $n_\ell(x)$. We prove useful bounds
on the margin of victory of any Bucklin election in Lemma 4.


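Lemma 4. Let $\mathcal{E} = (V, C)$ be an arbitrary instance of a Bucklin election and $w$ be the winner of $\mathcal{E}$. Let us define a
quantity $\Delta(\mathcal{E})$ as follows.

$$\Delta(\mathcal{E}) = \min_{\substack{\ell \in [m-1] : n_\ell(w) > n/2, \\ x \in C \setminus \{w\} : n_\ell(x) \le n/2}} \{n_\ell(w) - n_\ell(x) + 1\}$$

Then,

$$\frac{\Delta(\mathcal{E})}{2} \le M_{\text{Bucklin}}(\mathcal{E}) \le \Delta(\mathcal{E})$$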

Proof: Pick any $\ell \in [m - 1]$ and $x \in C \setminus \{w\}$ such that $n_\ell(w) > n/2$ and $n_\ell(x) \le n/2$. Now by changing
$n_\ell(w) - \lfloor n/2 \rfloor$ many votes, we can ensure that $w$ is not placed within the top $\ell$ positions in more than $n/2$ votes:
choose $n_\ell(w) - \lfloor n/2 \rfloor$ many votes where $w$ appears within top $\ell$ positions and swap $w$ with candidates placed at the last
position in those votes. Similarly, by changing $\lfloor n/2 \rfloor + 1 - n_\ell(x)$ many votes, we can ensure that $x$ is placed within top $\ell$
positions in more than $n/2$ votes. Hence, by changing at most $n_\ell(w) - \lfloor n/2 \rfloor + \lfloor n/2 \rfloor + 1 - n_\ell(x) = n_\ell(w) - n_\ell(x) + 1$
many votes, we can make $w$ not win the election. Hence, $M_{\text{Bucklin}}(\mathcal{E}) \le n_\ell(w) - n_\ell(x) + 1$. Now, since we have
picked an arbitrary $\ell$ and an arbitrary candidate $x$, we have $M_{\text{Bucklin}}(\mathcal{E}) \le \Delta(\mathcal{E})$.

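For the other inequality, since the margin of victory is $M_{\text{Bucklin}}(\mathcal{E})$, there exists an $\ell' \in [m - 1]$, a candidate
$x \in C \setminus \{w\}$, and $M_{\text{Bucklin}}(\mathcal{E})$ many votes in $V$ such that we can change those votes in such a way that in the
modified election, $w$ is not placed within top $\ell'$ positions in more than $n/2$ votes and $x$ is placed within top $\ell'$ positions
in more than $n/2$ votes. Hence, we have the following.

$$\begin{aligned}
& M_{\text{Bucklin}}(\mathcal{E}) \ge n_{\ell'}(w) - \left\lfloor \frac{n}{2} \right\rfloor, \quad M_{\text{Bucklin}}(\mathcal{E}) \ge \left\lfloor \frac{n}{2} \right\rfloor + 1 - n_{\ell'}(x) \\
\implies\ & M_{\text{Bucklin}}(\mathcal{E}) \ge \max\left\{ n_{\ell'}(w) - \left\lfloor \frac{n}{2} \right\rfloor,\ \left\lfloor \frac{n}{2} \right\rfloor + 1 - n_{\ell'}(x) \right\} \\
\implies\ & M_{\text{Bucklin}}(\mathcal{E}) \ge \frac{n_{\ell'}(w) - \lfloor \frac{n}{2} \rfloor + \lfloor \frac{n}{2} \rfloor + 1 - n_{\ell'}(x)}{2} \\
\implies\ & M_{\text{Bucklin}}(\mathcal{E}) \ge \frac{\Delta(\mathcal{E})}{2}
\end{aligned}$$

□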


Notice that, given an election $\mathcal{E}$, $\Delta(\mathcal{E})$ can be computed in a polynomial amount of time. Lemma 4 leads us to the
following theorem.

Theorem 6. There is a polynomial time $(\frac{1}{3}, \varepsilon, \delta)$-MoV algorithm for the Bucklin rule with sample complexity
$\frac{12}{\varepsilon^2} \ln \frac{2m}{\delta}$.

Proof sketch: Similar to the proof of Theorem 4, we estimate, for every candidate $x \in C$ and for every integer $\ell \in [m]$,
the number of votes where $x$ appears within top $\ell$ positions within an approximation factor of $(0, \frac{\varepsilon}{2})$. Next, we compute
an estimate $\hat{\Delta}(\mathcal{E})$ of $\Delta(\mathcal{E})$ from the sampled votes and output the estimate for the margin of victory as $\hat{\Delta}(\mathcal{E})/1.5$. Using
Lemma 4, we can argue the rest of the proof in a way that is analogous to the proofs of Theorem 3 and 4. □
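
The quantity $\Delta(\mathcal{E})$ of Lemma 4 can be computed with two nested loops, as sketched below for an explicit list of votes. The conditions $n_\ell(w) > n/2$ and $n_\ell(x) \le n/2$ are taken from the proof of Lemma 4; the function names and the tuple representation of votes are assumptions for illustration. In the sampling algorithm, the same formula would be applied to counts estimated from the sample and the result divided by 1.5.

```python
def n_top(votes, x, ell):
    """Number of votes that place x within their top ell positions."""
    return sum(1 for v in votes if x in v[:ell])


def bucklin_delta(votes, candidates, w):
    """Delta(E): minimum of n_l(w) - n_l(x) + 1 over admissible (l, x).

    Admissible pairs have l in [m - 1], n_l(w) > n / 2, x != w and
    n_l(x) <= n / 2. Returns None when no pair is admissible.
    """
    n, m = len(votes), len(candidates)
    best = None
    for ell in range(1, m):
        n_w = n_top(votes, w, ell)
        if 2 * n_w <= n:
            continue
        for x in candidates:
            if x == w:
                continue
            n_x = n_top(votes, x, ell)
            if 2 * n_x <= n:
                value = n_w - n_x + 1
                best = value if best is None else min(best, value)
    return best


votes = [("a", "b", "c")] * 3 + [("b", "a", "c")] * 2
delta = bucklin_delta(votes, ["a", "b", "c"], w="a")
```

In this five-vote example the minimum is attained at $\ell = 1$ with $x = b$. Changing a single vote from `a b c` to `b a c` already makes `b` the Bucklin winner, so the margin of victory is 1, consistent with $\Delta/2 \le M \le \Delta$.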


4.3 Maximin Voting Rule 

Next, we show the result for the maximin voting rule. 

Lemma 5. Let $\mathcal{E} = (V, C)$ be any instance of a maximin election. If $w$ and $z$ are the candidates that receive highest
and second highest maximin score respectively in $\mathcal{E}$, then,

$$2M_{\text{maximin}}(\mathcal{E}) \le s(w) - s(z) \le 4M_{\text{maximin}}(\mathcal{E})$$

Proof: Each vote change can increase the value of $s(z)$ by at most two and decrease the value of $s(w)$ by at most
two. Hence, we have $s(w) - s(z) \le 4M_{\text{maximin}}(\mathcal{E})$. Let $x$ be the candidate that minimizes $D_{\mathcal{E}}(w, x)$, that is,
$x \in \arg\min_{x \in C \setminus \{w\}} \{D_{\mathcal{E}}(w, x)\}$. Let $v \in V$ be a vote where $w$ is preferred over $x$. We replace the vote $v$ by the vote
$v' = z \succ x \succ \cdots \succ w$. This vote change reduces the score of $w$ by two and does not reduce the score of $z$. Hence,
$s(w) - s(z) \ge 2M_{\text{maximin}}(\mathcal{E})$. □
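Theorem 7. There is a polynomial time $(\frac{1}{3}, \varepsilon, \delta)$-MoV algorithm for the maximin rule with sample complexity
$\frac{24}{\varepsilon^2} \ln \frac{2m}{\delta}$.

Proof sketch: Let $\mathcal{E} = (V, C)$ be an instance of maximin election. Let $x$ and $y$ be any two candidates. We sample $\ell$
votes uniformly at random from the set of all votes with replacement.

$$X_i(x, y) = \begin{cases} 1, & \text{if } x \succ y \text{ in the } i\text{-th sample vote} \\ -1, & \text{else} \end{cases}$$

Define $\hat{D}_{\mathcal{E}}(x, y) = \frac{n}{\ell} \sum_{i=1}^{\ell} X_i(x, y)$. By using the Chernoff bound and union bound, we have the following.

$$\Pr[\exists x, y \in C, |\hat{D}_{\mathcal{E}}(x, y) - D_{\mathcal{E}}(x, y)| > \varepsilon n] \le 2m^2 \exp(-\Omega(\varepsilon^2 \ell))$$

We define $\hat{M} = \frac{\hat{s}(\hat{w}) - \hat{s}(\hat{z})}{3}$, the estimate of the margin of victory of $\mathcal{E}$, where $\hat{w} \in \arg\max_{x \in C} \{\hat{s}(x)\}$ and
$\hat{z} \in \arg\max_{x \in C \setminus \{\hat{w}\}} \{\hat{s}(x)\}$. Now, using Lemma 5, we can complete the rest of the proof in a way that is analogous
to the proof of Theorem 3. □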
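
The maximin estimator from the proof sketch can be illustrated as below. This is a sketch under assumptions: the sample size `ell` is left as a parameter rather than fixed by a formula, the divisor 3 is the midpoint suggested by the two bounds in Lemma 5, and the function name is hypothetical.

```python
import random


def estimate_mov_maximin(votes, candidates, ell, rng=random):
    """Estimate the maximin margin of victory from ell sampled votes."""
    n = len(votes)
    pairs = [(x, y) for x in candidates for y in candidates if x != y]
    d_hat = {pair: 0 for pair in pairs}
    for _ in range(ell):
        vote = votes[rng.randrange(n)]  # uniform, with replacement
        position = {c: i for i, c in enumerate(vote)}
        for x, y in pairs:
            # X_i(x, y) is +1 when the sampled vote prefers x to y, else -1.
            d_hat[(x, y)] += 1 if position[x] < position[y] else -1
    # Rescale to the full electorate and take each candidate's worst pairwise margin.
    s_hat = {
        x: min(d_hat[(x, y)] for y in candidates if y != x) * n / ell
        for x in candidates
    }
    top = sorted(s_hat.values(), reverse=True)
    return (top[0] - top[1]) / 3


# Two candidates, 700 votes for a over b and 300 for b over a (true margin 200).
votes = [("a", "b")] * 700 + [("b", "a")] * 300
estimate = estimate_mov_maximin(votes, ["a", "b"], ell=2000,
                                rng=random.Random(1))
```

Each sampled vote updates all ordered pairs, so the work per sample is quadratic in the number of candidates, while the number of samples stays independent of $n$.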


4.4 Copeland" Voting Rule 

Now, we present our result for the Copeland$^\alpha$ voting rule. Xia introduced the brilliant quantity called the relative
margin of victory (see Section 5.1 in [Xia, 2012]) which is a crucial ingredient in our algorithm for the Copeland$^\alpha$
voting rule. Given an election $\mathcal{E} = (V, C)$, a candidate $x \in C$, and an integer (may be negative also) $t$, $s'_t(V, x)$ is
defined as follows.

$$s'_t(V, x) = |\{y \in C : y \ne x, D_{\mathcal{E}}(y, x) < 2t\}| + \alpha |\{y \in C : y \ne x, D_{\mathcal{E}}(y, x) = 2t\}|$$

For every two distinct candidates $x$ and $y$, the relative margin of victory, denoted by $RM(x, y)$, between $x$ and $y$
is defined as the minimum integer $t$ such that $s'_{-t}(V, x) \le s'_t(V, y)$. Let $w$ be the winner of the election $\mathcal{E}$. We
define a quantity $\Gamma(\mathcal{E})$ to be $\min_{x \in C \setminus \{w\}} \{RM(w, x)\}$. Notice that, given an election $\mathcal{E}$, $\Gamma(\mathcal{E})$ can be computed in a
polynomial amount of time. Now we have the following lemma.
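
The definitions of $s'_t$, $RM$, and $\Gamma$ translate into a direct search, sketched below. The nested dictionary `D[x][y]` for pairwise differences and the function names are assumptions for illustration; the scan over $t$ from $-(n+1)$ to $n+1$ relies on the observation that every pairwise difference lies in $[-n, n]$, so the condition must hold by $t = n + 1$.

```python
def s_prime(D, x, t, alpha):
    """s'_t(V, x): count y with D(y, x) < 2t, plus alpha per y with D(y, x) = 2t."""
    against = [D[y][x] for y in D if y != x]
    return (sum(1 for d in against if d < 2 * t)
            + alpha * sum(1 for d in against if d == 2 * t))


def relative_margin(D, x, y, n, alpha):
    """RM(x, y): the minimum integer t with s'_{-t}(V, x) <= s'_t(V, y)."""
    for t in range(-(n + 1), n + 2):
        if s_prime(D, x, -t, alpha) <= s_prime(D, y, t, alpha):
            return t
    raise AssertionError("unreachable: the condition holds at t = n + 1")


def gamma(D, w, n, alpha):
    """Gamma(E): minimum relative margin between the winner w and any other candidate."""
    return min(relative_margin(D, w, x, n, alpha) for x in D if x != w)


# Ten votes over two candidates: seven prefer a to b, three prefer b to a.
D = {"a": {"b": 4}, "b": {"a": -4}}
rm = relative_margin(D, "a", "b", n=10, alpha=0.5)
```

In this example changing two votes produces a pairwise tie, after which both candidates have Copeland$^\alpha$ score $\alpha$, so the margin of victory is 2 and coincides with $\Gamma$.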

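Lemma 6. $\Gamma(\mathcal{E}) \le M_{\text{Copeland}^\alpha}(\mathcal{E}) \le 2(\lceil \log m \rceil + 1)\Gamma(\mathcal{E})$.

Proof: Follows from Theorem 11 in [Xia, 2012]. □

Theorem 8. For the Copeland$^\alpha$ voting rule, there is a polynomial time $\left(1 - O\left(\frac{1}{\ln m}\right), \varepsilon, \delta\right)$-MoV algorithm whose
sample complexity is $\frac{96}{\varepsilon^2} \ln \frac{2m}{\delta}$.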

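Proof: Let $\mathcal{E} = (V, C)$ be an instance of a Copeland$^{\alpha}$ election. For every $x, y \in C$, we compute $\hat{D}_{\mathcal{E}}(x, y)$, which
is an estimate of $D_{\mathcal{E}}(x, y)$ within an approximation factor of $(0, \varepsilon')$, where $\varepsilon'$ is a constant fraction of $\varepsilon$. This can be
achieved with an error probability of at most $\delta$ by sampling $O\left(\frac{1}{\varepsilon^2} \ln \frac{m}{\delta}\right)$ many votes uniformly at random with
replacement (the argument is the same as in the proof of Theorem 3). We define
$\hat{s}'_t(V, x) = |\{y \in C : y \neq x, \hat{D}_{\mathcal{E}}(y, x) < 2t\}| + \alpha\,|\{y \in C : y \neq x, \hat{D}_{\mathcal{E}}(y, x) = 2t\}|$.
We also define $\widehat{RM}(x, y)$ between $x$ and $y$ to be the minimum integer $t$ such that $\hat{s}'_{-t}(V, x) \le \hat{s}'_t(V, y)$.
Let $\bar{w}$ be the winner of the sampled election, $\bar{z} = \operatorname{argmin}_{x \in C \setminus \{\bar{w}\}}\{\widehat{RM}(\bar{w}, x)\}$, $w$ the winner of $\mathcal{E}$, and
$z = \operatorname{argmin}_{x \in C \setminus \{w\}}\{RM(w, x)\}$. Since $\hat{D}_{\mathcal{E}}(x, y)$ is an approximation of $D_{\mathcal{E}}(x, y)$ within a factor of $(0, \varepsilon')$, we
have the following for all candidates $x, y \in C$.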

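$$s'_t(V, x) - \varepsilon' n \le \hat{s}'_t(V, x) \le s'_t(V, x) + \varepsilon' n$$

$$RM(x, y) - 2\varepsilon' n \le \widehat{RM}(x, y) \le RM(x, y) + 2\varepsilon' n \qquad (3)$$

Define $\hat{\Gamma}(\mathcal{E}) = \widehat{RM}(\bar{w}, \bar{z})$ to be the estimate of $\Gamma(\mathcal{E})$. We show the following claim.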
Claim 1. With the above definitions of $w$, $z$, $\bar{w}$, and $\bar{z}$, we have the following.

$$\Gamma(\mathcal{E}) - 4\varepsilon' n \le \hat{\Gamma}(\mathcal{E}) \le \Gamma(\mathcal{E}) + 4\varepsilon' n$$
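Proof: Below, we show the upper bound for $\hat{\Gamma}(\mathcal{E})$.

$$\begin{aligned}
\hat{\Gamma}(\mathcal{E}) = \widehat{RM}(\bar{w}, \bar{z}) &\le \widehat{RM}(w, \bar{z}) + 2\varepsilon' n \\
&\le \widehat{RM}(w, z) + 2\varepsilon' n \\
&\le RM(w, z) + 4\varepsilon' n \\
&= \Gamma(\mathcal{E}) + 4\varepsilon' n
\end{aligned}$$

The second inequality follows from the fact that $\hat{D}_{\mathcal{E}}(x, y)$ is an approximation of $D_{\mathcal{E}}(x, y)$ by a factor of $(0, \varepsilon')$.
The third inequality follows from the definition of $\bar{z}$, and the fourth inequality uses inequality 3. Now, we show the
lower bound for $\hat{\Gamma}(\mathcal{E})$.

$$\begin{aligned}
\hat{\Gamma}(\mathcal{E}) = \widehat{RM}(\bar{w}, \bar{z}) &\ge \widehat{RM}(w, \bar{z}) - 2\varepsilon' n \\
&\ge RM(w, \bar{z}) - 4\varepsilon' n \\
&\ge RM(w, z) - 4\varepsilon' n \\
&= \Gamma(\mathcal{E}) - 4\varepsilon' n
\end{aligned}$$

The third inequality follows from inequality 3 and the fourth inequality follows from the definition of $z$. □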


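We define $\hat{M}$, the estimate of $M_{\mathrm{Copeland}^{\alpha}}(\mathcal{E})$, to be $\frac{4(\log m + 1)}{2 \log m + 3}\,\hat{\Gamma}(\mathcal{E})$. The following argument shows that $\hat{M}$ is a
$\left(1 - O\left(\frac{1}{\log m}\right), \varepsilon, \delta\right)$-estimate of $M_{\mathrm{Copeland}^{\alpha}}(\mathcal{E})$.

$$\begin{aligned}
\hat{M} - M_{\mathrm{Copeland}^{\alpha}}(\mathcal{E})
&= \frac{4(\log m + 1)}{2 \log m + 3}\,\hat{\Gamma}(\mathcal{E}) - M_{\mathrm{Copeland}^{\alpha}}(\mathcal{E}) \\
&\le \frac{4(\log m + 1)}{2 \log m + 3}\,\Gamma(\mathcal{E}) - M_{\mathrm{Copeland}^{\alpha}}(\mathcal{E}) + \frac{16(\log m + 1)}{2 \log m + 3}\,\varepsilon' n \\
&\le \frac{4(\log m + 1)}{2 \log m + 3}\,M_{\mathrm{Copeland}^{\alpha}}(\mathcal{E}) - M_{\mathrm{Copeland}^{\alpha}}(\mathcal{E}) + \frac{16(\log m + 1)}{2 \log m + 3}\,\varepsilon' n \\
&\le \frac{2 \log m + 1}{2 \log m + 3}\,M_{\mathrm{Copeland}^{\alpha}}(\mathcal{E}) + \varepsilon n \\
&\le \left(1 - O\left(\frac{1}{\log m}\right)\right) M_{\mathrm{Copeland}^{\alpha}}(\mathcal{E}) + \varepsilon n
\end{aligned}$$

The second line follows from Claim 1 and the third line follows from Lemma 6. Analogously, we have:

$$\begin{aligned}
M_{\mathrm{Copeland}^{\alpha}}(\mathcal{E}) - \hat{M}
&= M_{\mathrm{Copeland}^{\alpha}}(\mathcal{E}) - \frac{4(\log m + 1)}{2 \log m + 3}\,\hat{\Gamma}(\mathcal{E}) \\
&\le M_{\mathrm{Copeland}^{\alpha}}(\mathcal{E}) - \frac{4(\log m + 1)}{2 \log m + 3}\,\Gamma(\mathcal{E}) + \frac{16(\log m + 1)}{2 \log m + 3}\,\varepsilon' n \\
&\le M_{\mathrm{Copeland}^{\alpha}}(\mathcal{E}) - \frac{4(\log m + 1)}{2 \log m + 3} \cdot \frac{M_{\mathrm{Copeland}^{\alpha}}(\mathcal{E})}{2(\log m + 1)} + \frac{16(\log m + 1)}{2 \log m + 3}\,\varepsilon' n \\
&\le \frac{2 \log m + 1}{2 \log m + 3}\,M_{\mathrm{Copeland}^{\alpha}}(\mathcal{E}) + \varepsilon n \\
&\le \left(1 - O\left(\frac{1}{\log m}\right)\right) M_{\mathrm{Copeland}^{\alpha}}(\mathcal{E}) + \varepsilon n
\end{aligned}$$

The second line follows from Claim 1 and the third line follows from Lemma 6. □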
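As a side remark, the scaling factor $\frac{4(\log m + 1)}{2 \log m + 3}$ used for the estimate can be motivated by balancing the two one-sided errors. The following is a sketch that ignores the sampling error and uses only the two bounds of Lemma 6; the shorthand $L = \log m + 1$, $M$ for the margin of victory, $\Gamma$ for the quantity of Lemma 6, and the generic factor $c$ are notation introduced here for illustration.

```latex
% Lemma 6 in shorthand: \Gamma \le M \le 2L\,\Gamma, with L = \log m + 1.
% For an estimate c\,\Gamma the two one-sided errors are bounded by
\begin{align*}
  c\,\Gamma - M &\le c\,M - M = (c - 1)\,M
      && \text{(using } \Gamma \le M\text{)} \\
  M - c\,\Gamma &\le M - \frac{c}{2L}\,M = \Bigl(1 - \frac{c}{2L}\Bigr) M
      && \text{(using } \Gamma \ge M / (2L)\text{)}
\end{align*}
% Equating the two bounds determines c:
\begin{align*}
  c - 1 = 1 - \frac{c}{2L}
  \;\Longrightarrow\;
  c = \frac{4L}{2L + 1} = \frac{4(\log m + 1)}{2 \log m + 3},
  \qquad
  c - 1 = \frac{2L - 1}{2L + 1} = \frac{2 \log m + 1}{2 \log m + 3}
        = 1 - \frac{2}{2 \log m + 3}.
\end{align*}
```

The common value $1 - \frac{2}{2 \log m + 3}$ is of the form $1 - O\left(\frac{1}{\log m}\right)$, which matches the approximation factor in the statement of Theorem 8.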
The approximation factor in Theorem 8 is weak when we have a large number of candidates. The main difficulty
in showing a better approximation factor for the Copeland$^{\alpha}$ voting rule is to find a polynomial time computable
quantity (for example, $\Gamma(\mathcal{E})$ in Lemma 6) that bounds the margin of victory tightly. We remark that the existence
of such a quantity would imply not only a better estimation algorithm but also a better approximation algorithm (the
best known approximation factor for finding the margin of victory for the Copeland$^{\alpha}$ voting rule is $O(\log m)$ and it
uses the quantity $\Gamma(\mathcal{E})$). However, Theorem 8 will be useful in applications, for example, post election
audit and polling, where the number of candidates is often small.

5 Conclusion 

We have introduced the $(c, \varepsilon, \delta)$-MoV problem and presented efficient sampling-based algorithms for solving it for
many commonly used voting rules. Besides closing the gap in the sample complexity, an interesting future direction is
to study how knowledge of the social network structure among the voters impacts the sample complexity. Characterizing
voting rules for which the sample complexity of this problem is independent of $m$ and $n$ is another interesting research
direction to pursue.